AITopics | Ibadan

An explosion of high-throughput DNA sequencing in the past decade has led to a surge of interest in population-scale inference with whole-genome data.

artificial intelligence, machine learning, neural network, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
North America > United States > Utah (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

'I invested in a Ponzi scheme': Nigerians fall victim to crypto scams

Al JazeeraJun-11-2025, 09:31:17 GMT

Mandela Fadahunsi, who works at a technical training school in Ikeja in Nigeria's Lagos, never believed he could fall victim to a Ponzi scheme. On April 6, the 26-year-old was starting his day when a WhatsApp notification lit up his phone screen. Someone on the group chat for investors of the cryptocurrency investment platform, Crypto Bridge Exchange (CBEX), had tried and failed to withdraw some funds, so they wanted to confirm if it was a general issue. Fadahunsi quickly logged on to his digital wallet and tried to withdraw 500 USDT, a cryptocurrency that stands for United States Dollar Tether, or simply Tether. But 24 hours later, a process that should have taken just 10 minutes was yet to complete.

artificial intelligence, ponzi scheme, social media, (17 more...)

Al Jazeera

Country:

North America > United States (0.49)
Africa > Nigeria > Lagos State > Ikeja (0.24)
Africa > Nigeria > Oyo State > Ibadan (0.04)
Africa > Nigeria > Lagos State > Lagos (0.04)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > e-Commerce > Financial Technology (0.55)
Information Technology > Communications > Social Media (0.35)

Add feedback

Browsing Lost Unformed Recollections: A Benchmark for Tip-of-the-Tongue Search and Reasoning

CH-Wang, Sky, Deshpande, Darshan, Muresan, Smaranda, Kannappan, Anand, Qian, Rebecca

arXiv.org Artificial IntelligenceMar-24-2025

We introduce Browsing Lost Unformed Recollections, a tip-of-the-tongue known-item search and reasoning benchmark for general AI assistants. BLUR introduces a set of 573 real-world validated questions that demand searching and reasoning across multi-modal and multilingual inputs, as well as proficient tool use, in order to excel on. Humans easily ace these questions (scoring on average 98%), while the best-performing system scores around 56%. To facilitate progress toward addressing this challenging and aspirational use case for general AI assistants, we release 350 questions through a public leaderboard, retain the answers to 250 of them, and have the rest as a private test set.

information retrieval, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2503.19193

Country:

Africa > Nigeria > Oyo State > Ibadan (0.05)
Asia > Singapore (0.04)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry:

Media (1.00)
Leisure & Entertainment (1.00)
Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(2 more...)

Add feedback

CardioTabNet: A Novel Hybrid Transformer Model for Heart Disease Prediction using Tabular Medical Data

Sumon, Md. Shaheenur Islam, Islam, Md. Sakib Bin, Rahman, Md. Sohanur, Hossain, Md. Sakib Abrar, Khandakar, Amith, Hasan, Anwarul, Murugappan, M, Chowdhury, Muhammad E. H.

arXiv.org Artificial IntelligenceMar-22-2025

The early detection and prediction of cardiovascular diseases are crucial for reducing the severe morbidity and mortality associated with these conditions worldwide. A multi-headed self-attention mechanism, widely used in natural language processing (NLP), is operated by Transformers to understand feature interactions in feature spaces. However, the relationships between various features within biological systems remain ambiguous in these spaces, highlighting the necessity of early detection and prediction of cardiovascular diseases to reduce the severe morbidity and mortality with these conditions worldwide. We handle this issue with CardioTabNet, which exploits the strength of tab transformer to extract feature space which carries strong understanding of clinical cardiovascular data and its feature ranking. As a result, performance of downstream classical models significantly showed outstanding result. Our study utilizes the open-source dataset for heart disease prediction with 1190 instances and 11 features. In total, 11 features are divided into numerical (age, resting blood pressure, cholesterol, maximum heart rate, old peak, weight, and fasting blood sugar) and categorical (resting ECG, exercise angina, and ST slope). Tab transformer was used to extract important features and ranked them using random forest (RF) feature ranking algorithm. Ten machine-learning models were used to predict heart disease using selected features. After extracting high-quality features, the top downstream model (a hyper-tuned ExtraTree classifier) achieved an average accuracy rate of 94.1% and an average Area Under Curve (AUC) of 95.0%. Furthermore, a nomogram analysis was conducted to evaluate the model's effectiveness in cardiovascular risk assessment. A benchmarking study was conducted using state-of-the-art models to evaluate our transformer-driven framework.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2503.17664

Country:

Europe > Switzerland (0.04)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)
Asia > Middle East > Kuwait (0.04)
(8 more...)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(3 more...)

Add feedback

NaijaNLP: A Survey of Nigerian Low-Resource Languages

Inuwa-Dutse, Isa

arXiv.org Artificial IntelligenceMar-6-2025

With over 500 languages in Nigeria, three languages -- Hausa, Yor\`ub\'a and Igbo -- spoken by over 175 million people, account for about 60% of the spoken languages. However, these languages are categorised as low-resource due to insufficient resources to support tasks in computational linguistics. Several research efforts and initiatives have been presented, however, a coherent understanding of the state of Natural Language Processing (NLP) - from grammatical formalisation to linguistic resources that support complex tasks such as language understanding and generation is lacking. This study presents the first comprehensive review of advancements in low-resource NLP (LR-NLP) research across the three major Nigerian languages (NaijaNLP). We quantitatively assess the available linguistic resources and identify key challenges. Although a growing body of literature addresses various NLP downstream tasks in Hausa, Igbo, and Yor\`ub\'a, only about 25.1% of the reviewed studies contribute new linguistic resources. This finding highlights a persistent reliance on repurposing existing data rather than generating novel, high-quality resources. Additionally, language-specific challenges, such as the accurate representation of diacritics, remain under-explored. To advance NaijaNLP and LR-NLP more broadly, we emphasise the need for intensified efforts in resource enrichment, comprehensive annotation, and the development of open collaborative initiatives.

arxiv preprint arxiv, dataset, naijanlp, (14 more...)

arXiv.org Artificial Intelligence

2502.19784

Country:

Africa > Niger (0.14)
Africa > Cameroon (0.14)
Africa > Nigeria > Jigawa State > Dutse (0.05)
(29 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Health & Medicine (1.00)
Information Technology > Security & Privacy (0.46)
Media > News (0.46)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
(3 more...)

Add feedback

Bridging Gaps in Natural Language Processing for Yor\`ub\'a: A Systematic Review of a Decade of Progress and Prospects

Jimoh, Toheeb A., De Wille, Tabea, Nikolov, Nikola S.

arXiv.org Artificial IntelligenceFeb-24-2025

Natural Language Processing (NLP) is becoming a dominant subset of artificial intelligence as the need to help machines understand human language looks indispensable. Several NLP applications are ubiquitous, partly due to the myriads of datasets being churned out daily through mediums like social networking sites. However, the growing development has not been evident in most African languages due to the persisting resource limitation, among other issues. Yor\`ub\'a language, a tonal and morphologically rich African language, suffers a similar fate, resulting in limited NLP usage. To encourage further research towards improving this situation, this systematic literature review aims to comprehensively analyse studies addressing NLP development for Yor\`ub\'a, identifying challenges, resources, techniques, and applications. A well-defined search string from a structured protocol was employed to search, select, and analyse 105 primary studies between 2014 and 2024 from reputable databases. The review highlights the scarcity of annotated corpora, limited availability of pre-trained language models, and linguistic challenges like tonal complexity and diacritic dependency as significant obstacles. It also revealed the prominent techniques, including rule-based methods, among others. The findings reveal a growing body of multilingual and monolingual resources, even though the field is constrained by socio-cultural factors such as code-switching and desertion of language for digital usage. This review synthesises existing research, providing a foundation for advancing NLP for Yor\`ub\'a and in African languages generally. It aims to guide future research by identifying gaps and opportunities, thereby contributing to the broader inclusion of Yor\`ub\'a and other under-resourced African languages in global NLP advancements.

african language, dataset, yor, (14 more...)

arXiv.org Artificial Intelligence

2502.17364

Country:

North America > United States (0.14)
Africa > Niger (0.05)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
(37 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Research Report > Experimental Study (0.68)

Industry:

Information Technology (0.46)
Education (0.46)
Media (0.45)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
(5 more...)

Add feedback

Approximating Hierarchical MV-sets for Hierarchical Clustering

Assaf Glazer, Omer Weissbrod, Michael Lindenbaum, Shaul Markovitch

Neural Information Processing SystemsFeb-9-2025, 13:05:46 GMT

The goal of hierarchical clustering is to construct a cluster tree, which can be viewed as the modal structure of a density. For this purpose, we use a convex optimization program that can efficiently estimate a family of hierarchical dense sets in high-dimensional distributions. We further extend existing graph-based methods to approximate the cluster tree of a distribution. By avoiding direct density estimation, our method is able to handle high-dimensional data more efficiently than existing density-based approaches. We present empirical results that demonstrate the superiority of our method over existing ones.

artificial intelligence, cluster tree, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Utah (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > Scotland (0.04)
(7 more...)

Industry: Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback

Creating Artificial Students that Never Existed: Leveraging Large Language Models and CTGANs for Synthetic Data Generation

Khalil, Mohammad, Vadiee, Farhad, Shakya, Ronas, Liu, Qinyi

arXiv.org Artificial IntelligenceJan-3-2025

In this study, we explore the growing potential of AI and deep learning technologies, particularly Generative Adversarial Networks (GANs) and Large Language Models (LLMs), for generating synthetic tabular data. Access to quality students data is critical for advancing learning analytics, but privacy concerns and stricter data protection regulations worldwide limit their availability and usage. Synthetic data offers a promising alternative. We investigate whether synthetic data can be leveraged to create artificial students for serving learning analytics models. Using the popular GAN model CTGAN and three LLMs- GPT2, DistilGPT2, and DialoGPT, we generate synthetic tabular student data. Our results demonstrate the strong potential of these methods to produce high-quality synthetic datasets that resemble real students data. To validate our findings, we apply a comprehensive set of utility evaluation metrics to assess the statistical and predictive performance of the synthetic data and compare the different generator models used, specially the performance of LLMs. Our study aims to provide the learning analytics community with valuable insights into the use of synthetic data, laying the groundwork for expanding the field methodological toolbox with new innovative approaches for learning analytics data generation.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2501.01793

Country:

Europe > Norway > Western Norway > Vestland > Bergen (0.05)
South America > Uruguay > Maldonado > Maldonado (0.04)
South America > Brazil (0.04)
(6 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Education > Educational Setting > Online (0.68)
Education > Educational Technology > Educational Software > Computer Based Training (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Generalized Grade-of-Membership Estimation for High-dimensional Locally Dependent Data

Chen, Ling, Huang, Chengzhu, Gu, Yuqi

arXiv.org Machine LearningDec-27-2024

This work focuses on the mixed membership models for multivariate categorical data widely used for analyzing survey responses and population genetics data. These grade of membership (GoM) models offer rich modeling power but present significant estimation challenges for high-dimensional polytomous data. Popular existing approaches, such as Bayesian MCMC inference, are not scalable and lack theoretical guarantees in high-dimensional settings. To address this, we first observe that data from this model can be reformulated as a three-way (quasi-)tensor, with many subjects responding to many items with varying numbers of categories. We introduce a novel and simple approach that flattens the three-way quasi-tensor into a "fat" matrix, and then perform a singular value decomposition of it to estimate parameters by exploiting the singular subspace geometry. Our fast spectral method can accommodate a broad range of data distributions with arbitrarily locally dependent noise, which we formalize as the generalized-GoM models. We establish finite-sample entrywise error bounds for the generalized-GoM model parameters. This is supported by a new sharp two-to-infinity singular subspace perturbation theory for locally dependent and flexibly distributed noise, a contribution of independent interest. Simulations and applications to data in political surveys, population genetics, and single-cell sequencing demonstrate our method's superior performance.

inequality, matrix, probability, (16 more...)

arXiv.org Machine Learning

2412.19796

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Utah (0.04)
Africa > Nigeria > Oyo State > Ibadan (0.04)
(5 more...)

Genre: Research Report (0.81)

Industry:

Government > Regional Government (0.67)
Government > Immigration & Customs (0.46)
Health & Medicine > Therapeutic Area (0.45)
Health & Medicine > Pharmaceuticals & Biotechnology (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback